The dynamics of health in the British Household Panel Survey

Abstract

This paper considers the dynamics of a categorical indicator of self-assessed health using eight waves (1991–1998) of the British Household Panel Survey (BHPS). Our analysis has three focal points: the relative contributions of state dependence and heterogeneity in explaining the dynamics of health, the existence and consequences of health-related sample attrition, and the investigation of the effects of measures of socioeconomic status, with a particular focus on educational attainment and income. To investigate these issues we use dynamic panel ordered probit models. There is clear evidence of health-related attrition in the data but this does not distort the estimates of state dependence and of the socioeconomic gradient in health. The models show strong positive state dependence and heterogeneity accounts for around 30% of the unexplained variation in health. Copyright © 2004 John Wiley & Sons, Ltd.

1. INTRODUCTION

This paper considers the dynamics of self-assessed health (SAH) using eight waves (1991/92–1998/99) of the British Household Panel Survey (BHPS). The main objective of this paper is to analyse the dynamics of individual health. This is of interest because of the persistence in health outcomes revealed by the BHPS data. The categorical measure of SAH has five possible responses: excellent, good, fair, poor and very poor. To illustrate the evidence of persistence, consider a multinomial distribution with constant probabilities of being in each state in every period. This can be viewed as a random baseline model. Assume that the probability that an individual reports excellent health in a given period is 0.277 (the approximate proportion of males reporting excellent self-assessed health in the balanced sample of data). Then, for our working sample of 2780 men, we would expect (0.277^8) × 2780 = 0.0964 of them to report excellent health in all eight periods. This implies that, on average, a sample of over (1/0.0964) × 2780 = 28,838 would be required to observe one individual reporting excellent health in every period. By contrast, in our working sample, we observe 101 men who always report excellent health. Analogous calculations for other health states lead to similar comparisons. Similarly, for women we would expect (0.216^8) × 3344 = 0.0158 to always report excellent health in our sample, implying that under the random model, on average, a sample size of around (1/0.0158) × 3344 = 211,646 is required to observe one woman in excellent health in every period. By contrast we observe 83 women who always report excellent health. This paper aims to decompose this observed persistence in health outcomes into components attributable to state dependence and unobserved heterogeneity, along with the effects of observed socioeconomic characteristics.1
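The random baseline calculation can be reproduced in a few lines. The following is a minimal sketch, assuming independent draws across the eight waves with a constant probability; the function names are illustrative and do not come from the paper.

```python
def expected_always(p: float, n_people: int, n_waves: int = 8) -> float:
    """Expected number of people observed in one state in every wave when each
    wave is an independent draw with constant probability p."""
    return (p ** n_waves) * n_people


def sample_for_one(p: float, n_waves: int = 8) -> float:
    """Sample size at which the expected number of such people equals one."""
    return 1.0 / (p ** n_waves)


# Proportions, sample sizes and observed counts quoted in the text (balanced sample).
men_expected = expected_always(0.277, 2780)
women_expected = expected_always(0.216, 3344)

print(f"men:   expected {men_expected:.4f}, observed 101")
print(f"women: expected {women_expected:.4f}, observed 83")
print(f"sample needed for one man:   {sample_for_one(0.277):,.0f}")
print(f"sample needed for one woman: {sample_for_one(0.216):,.0f}")
```

The required sample sizes printed here differ slightly from those in the text, because the text divides by the expected count after rounding it to four decimal places.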

The second objective of the paper is to explore the consequences of health-related attrition. In using a panel dataset like the BHPS to analyse health dynamics there is a risk that the results will be contaminated by survivorship bias. There is attrition from the panel at each wave and some of it is expected to be health-related, owing to deaths, serious illness and people moving into institutional care.2 So the long-term survivors who remain in the panel are likely to be healthier on average: their level of health will tend to be higher than that of the population as a whole and their rate of decline in health will tend to be lower. Failing to account for attrition may therefore result in misleading estimates of health dynamics and of the relationship between health and socioeconomic characteristics. To address this issue we describe the pattern of health-related attrition revealed by the BHPS data and we test and correct for attrition in our empirical models.

The third objective is to consider the relationship between health and socioeconomic status. In particular we investigate the relationship between SAH and household income. This is of particular interest in the context of the recent focus on the impact of poverty and deprivation on health (e.g. Benzeval et al., 2000; Deaton, 2003). Previous analyses of this issue using BHPS (e.g. Benzeval et al., 2000) have employed simple empirical models and measures of income which have not exploited fully the panel dimension of the data. The empirical models used here allow for persistent unobservable effects and make full use of the outcome information contained in the dataset. Previous literature concerning health dynamics has considered the relationship between health and schooling (e.g. Grossman, 2000). We analyse whether the dynamics of health vary with levels of education. This is of particular relevance when considering interventions to improve health. To investigate this issue we estimate the empirical models after splitting the data by both gender and the highest academic qualifications attained at the beginning of the survey. By conditioning on previous health outcomes we are able to reduce fears of bias due to reverse causality (see Adams et al., 2003).

The use of a categorical measure of SAH leads us to use dynamic panel ordered probit models. This raises some methodological challenges that include dealing with correlated individual effects, the initial conditions problem and attrition bias. We adopt the approach suggested by Wooldridge (2002a) to deal with correlated individual effects and the problem of initial conditions in non-linear models with unobserved effects and lagged dependent variables. This problem is due to the generic feature that the starting point of a survey is not the beginning of a process, and that individuals inherit different unobserved and time-invariant characteristics which affect outcomes in every period. These phenomena lead to endogeneity bias in dynamic models with covariance structures that are not diagonal. Secondly, we explore the role of sample attrition in the BHPS. We apply variable addition tests for attrition bias (Verbeek and Nijman, 1992) and inverse probability weighting to adjust for attrition in estimation of pooled models (Wooldridge, 2002b).

The structure of the paper is as follows. Section 2 introduces the BHPS data and describes the samples and variables we use for estimation. In Section 3 we introduce the empirical models and estimation strategy. Section 4 reports and discusses the results and a conclusion is provided in Section 5.

2. THE BHPS DATASET

2.1. Sample and Variables

In estimating the empirical models for self-assessed health we exploit the panel data available in the first eight waves (1991–1998) of the British Household Panel Survey (BHPS).3 This includes rich information on occupational, sociodemographic and health variables. The BHPS is a longitudinal survey of private households in Great Britain (England, Wales and Scotland south of the Caledonian Canal), and was designed as an annual survey of each adult (16+) member of a nationally representative sample of more than 5000 households, with a total of approximately 10,000 individual interviews. The first wave of the survey was conducted between 1st September 1990 and 30th April 1991. The initial selection of households for inclusion in the survey was performed using a two-stage stratified systematic sampling procedure designed to give each address an approximately equal probability of selection.4 The same individuals are re-interviewed in successive waves and, if they split off from their original households, are also re-interviewed along with all adult members of their new households. In this analysis we use both balanced samples of respondents, for whom information on all the required variables is reported at each wave, and unbalanced samples that exploit all available observations. The unbalanced sample does not include new entrants but tracks all of those who are observed at wave 1. The issue of sample attrition is discussed below.

Self-assessed Health

Table I summarizes the variables used in our empirical models of health dynamics. The health variable (SAH) is defined by a response to: ‘Please think back over the last 12 months about how your health has been. Compared to people of your own age, would you say that your health has on the whole been excellent/good/fair/poor/very poor?’ SAH should therefore be interpreted as indicating a perceived health status relative to the individual's concept of the ‘norm’ for their age group. In any case, we condition on a quartic function of age in the empirical analysis.

Table I. Variable definitions
Variable     Definition
SAH          Self-assessed health: 5 if excellent, 4 if good, 3 if fair, 2 if poor, 1 if very poor
WIDOW        1 if widowed, 0 otherwise
SINGLE       1 if never married, 0 otherwise
DIV/SEP      1 if divorced or separated, 0 otherwise
NON-WHITE    1 if a member of ethnic group other than white, 0 otherwise
DEGREE       1 if highest academic qualification is a degree or higher degree, 0 otherwise
HND/A        1 if highest academic qualification is HND or A level, 0 otherwise
O/CSE        1 if highest academic qualification is O level or CSE, 0 otherwise
HHSIZE       Number of people in household including respondent
NCH04        Number of children in household aged 0–4
NCH511       Number of children in household aged 5–11
NCH1218      Number of children in household aged 12–18
INCOME       Equivalized annual real household income in pounds
AGE          Age in years at 1st December of current wave

SAH has been used widely in previous studies of the relationship between health and socioeconomic status (e.g. Adams et al., 2003; Benzeval et al., 2000; Deaton and Paxson, 1998; Ettner, 1996; Frijters et al., 2003; Salas, 2002; Smith, 1999) and of the relationship between health and lifestyles (e.g. Kenkel, 1995; Contoyannis and Jones, 2004). SAH is a simple subjective measure of health that provides an ordinal ranking of perceived health status. However, it has been shown to be a powerful predictor of subsequent mortality (see e.g. Idler and Kasl, 1995; Idler and Benyamini, 1997) and its predictive power does not appear to vary across socioeconomic groups (see e.g. Burström and Fredlund, 2001). Socioeconomic inequalities in SAH have been a focus of research (see e.g. van Doorslaer et al., 1997; van Doorslaer and Koolman, 2002; van Ourti, 2003) and have been shown to predict inequalities in mortality (see e.g. van Doorslaer and Gerdtham, 2003). Categorical measures of SAH have been shown to be good predictors of subsequent use of medical care (see e.g. van Doorslaer et al., 2000, 2002).

However, as a self-reported subjective measure of health, SAH may be prone to measurement error.5 It is sometimes argued that the mapping of ‘true health’ into SAH categories may vary with respondent characteristics. This source of measurement error has been termed ‘state-dependent reporting bias’ (Kerkhofs and Lindeboom, 1995), ‘scale of reference bias’ (Groot, 2000) and ‘response category cut-point shift’ (Sadana et al., 2000; Murray et al., 2001). This occurs if subgroups of the population use systematically different cut point levels when reporting their SAH, despite having the same level of ‘true’ health. In the context of ordered probit models, the symptoms of cut point shift can be captured by making the cut points dependent on some or all of the exogenous variables used in the model and estimating a generalized ordered probit. This requires strong a priori restrictions on which variables affect health and which affect reporting in order to separately identify the influence of variables on latent health and on measurement error. It is worth noting that allowing the scaling of SAH to vary across individual characteristics is equivalent to a heteroskedastic specification of the underlying latent variable equation (see e.g. van Doorslaer and Jones, 2003). This is because location and scale cannot be separately identified in binary and ordered choice models and, in general, it is not possible to separate measurement error from heterogeneity. Attempts to surmount this problem include modelling the reporting bias based on more ‘objective’ indicators of true health (Kerkhofs and Lindeboom, 1995; Lindeboom and van Doorslaer, 2003) and the use of ‘vignettes’ to fix the scale (Murray et al., 2001). Lindeboom and van Doorslaer (2003) analyse SAH in the Canadian National Population Health Survey and use the McMaster Health Utility Index (HUI-3) as their objective measure of health. 
They find evidence of cut point shift with respect to age and gender, but not for income, education or linguistic group. In our analysis we always split the sample by gender before estimating ordered probit models and we carry out a sensitivity analysis by estimating the panel data regressions for SAH on subsamples defined by age group, educational qualifications and income quartile. Evidence of heterogeneity in the effects of socioeconomic characteristics on health across these subsamples could indicate measurement error (van Doorslaer and Jones, 2003).

Other Variables

Income is measured as equivalized and RPI deflated annual household income (INCOME). This variable is transformed to natural logarithms to allow for concavity of the health–income relationship (e.g. Ettner, 1996; van Doorslaer and Koolman, 2002; Frijters et al., 2003). Other variables included are marital status (WIDOW, SINGLE, DIV/SEP) and the highest educational qualification attained by the end of the sample period in descending order of attainment (DEGREE, HND/A, O/CSE). Married or living as a couple is the excluded category for marital status. Similarly, NO-QUAL (no academic qualifications) is excluded for the educational variable. We include an indicator of ethnic origin (NON-WHITE), the number of individuals living in the household including the respondent (HHSIZE), and the numbers of children living in the household at different ages (NCH04, NCH511, NCH1218). Age is included as a fourth-order polynomial (AGE, AGE2 = AGE^2/100, AGE3 = AGE^3/10,000, AGE4 = AGE^4/1,000,000), and a vector of time dummies is included to account for aggregate health shocks, time-varying reporting changes and any effects of age which are not captured by the polynomial.
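The transformations described above are simple to state in code. This is a minimal sketch of the log-income and scaled age-polynomial terms only; the function names are hypothetical and the input values are made up for illustration.

```python
import math


def age_polynomial(age: float) -> dict:
    """Scaled quartic in age, using the divisors given in the text."""
    return {
        "AGE": age,
        "AGE2": age ** 2 / 100,
        "AGE3": age ** 3 / 10_000,
        "AGE4": age ** 4 / 1_000_000,
    }


def log_income(income: float) -> float:
    """Natural log of equivalized real household income (must be positive)."""
    if income <= 0:
        raise ValueError("income must be positive to take logs")
    return math.log(income)


# Illustrative respondent: aged 50 with a made-up equivalized income of 13,000 pounds.
print(age_polynomial(50))
print(round(log_income(13_000), 3))
```

Dividing the higher powers of age by 100, 10,000 and 1,000,000 keeps the four terms on a similar numerical scale, which tends to make the estimated coefficients easier to read.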

2.2. Data Description

Figure 1 describes the distribution of SAH across all eight waves.6 The distributions show a long right tail, with the majority of observations being either excellent or good. The figure shows an interesting pattern; although the interviewee is asked to report their self-assessed health relative to a representative person of their own age, there is a trend for the distribution of health to become worse over time. This can be seen by the steady increase in the proportions of observations in the fair, poor and very poor categories, while there is a gradual decrease in the proportion reporting excellent health. The age profile of SAH is reinforced by Figure 2, which shows the distribution by age groups at wave 1.

Figure 1. Self-assessed health status by wave

Figure 2. Self-assessed health status by age group at first wave

Figure 3 displays the relationship between health and income by showing the distribution of self-assessed health, pooled over eight waves, by quintiles of mean equivalized real household income. The figure shows that the distribution of SAH improves as household income increases; as we move up the income distribution the proportions of observations in the excellent and good categories increase while those in the fair, poor and very poor categories decrease. An alternative way of considering this pattern is to consider the empirical CDF of household income by self-assessed health status; where the y-axis shows the proportion of observations with household income below the values given on the x-axis. If a positive relationship exists between health and income we would expect the proportion of observations with income below a particular level to be higher when self-assessed health is poorer; we would expect the CDFs for successively lower levels of health to be successively above those for higher levels of health. This is observable in Plate 1, which plots the empirical distribution functions of equivalized real income for the different categories of SAH. Moving from left to right to compare the distribution of income across increasing levels of SAH, these show evidence of stochastic dominance.

Figure 3. Self-assessed health status by quintile of mean income

Plate 1. Empirical CDFs of mean income by self-assessed health status
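The comparison of empirical CDFs across health categories can be sketched as a first-order stochastic dominance check. The code below is an illustration under stated assumptions: the income values are invented toy data, not BHPS figures, and the function names are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function F(x)."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        # Proportion of observations with value less than or equal to x.
        return bisect_right(xs, x) / n

    return F


def dominates(higher, lower):
    """True if `higher` first-order stochastically dominates `lower`, i.e. its
    empirical CDF never lies above that of `lower` at any observed value."""
    F_hi, F_lo = ecdf(higher), ecdf(lower)
    grid = sorted(set(higher) | set(lower))
    return all(F_hi(x) <= F_lo(x) for x in grid)


# Invented toy incomes for two health categories (not taken from the survey).
income_good_health = [12_000, 15_000, 20_000, 30_000]
income_poor_health = [8_000, 10_000, 15_000, 18_000]

print(dominates(income_good_health, income_poor_health))
print(dominates(income_poor_health, income_good_health))
```

Because empirical CDFs are step functions that change only at observed values, checking the inequality on the pooled set of observations is enough to compare the two curves everywhere.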

Educational status is often considered as a determinant of health and source of health inequalities, and it is of interest to consider how health is related to educational attainment in our data. Figure 4 shows the distribution of SAH, pooled over all eight waves, by maximum educational attainment. This reveals a positive gradient between education and health for both men and women.

Figure 4. Self-assessed health status by maximum educational attainment

State Dependence

The main focus of this paper is on health dynamics: how does health status in the previous period affect the probability distribution of current health status? While the model-based approaches allow us to condition on other variables when ascertaining the effect of previous health status, Figure 5 and Table II inform this aspect of health dynamics without conditioning on other variables. Figure 5 shows the distribution of SAH at wave 2 by SAH at wave 1. Persistence in health outcomes is observable from the figure. For the most extreme cases, it is clear that the probabilities of transitions to very poor health from excellent health, or the reverse, are almost zero. This observation can be generalized; individuals are far more likely to remain close to their initial state than move far away from it. An alternative way of seeing this is by considering the transition matrices in Table II. Here the rows indicate previous health state while the columns indicate the current state; e.g. the elements of the first row provide information on the conditional distribution of SAH at t given excellent SAH at t − 1. The elements of the table can be interpreted as the conditional probabilities under a Markov model. Persistence is again observable by considering the relative magnitudes of the diagonal elements and those close to them compared to those far from the diagonal.

Figure 5. Self-assessed health at wave 2 by self-assessed health at wave 1

Table II. Transition matrices, balanced panel
(a) Men
SAH          EX       GOOD     FAIR     POOR     VERY POOR   N
EX           0.600    0.342    0.046    0.010    0.002       5485
GOOD         0.184    0.651    0.142    0.019    0.004       9263
FAIR         0.055    0.361    0.471    0.100    0.012       3433
POOR         0.029    0.120    0.340    0.418    0.093       1031
VERY POOR    0.032    0.073    0.133    0.423    0.339       248
N            5231     9287     3565     1111     266         19,460
(b) Women
SAH          EX       GOOD     FAIR     POOR     VERY POOR   N
EX           0.572    0.353    0.059    0.013    0.004       5164
GOOD         0.150    0.657    0.162    0.026    0.005       11,306
FAIR         0.040    0.362    0.465    0.116    0.017       4928
POOR         0.021    0.156    0.360    0.365    0.098       1587
VERY POOR    0.014    0.106    0.192    0.326    0.362       423
N            4884     11,329   5082     1649     464         23,408
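A transition matrix of this kind can be built by pooling all wave-to-wave pairs and normalising each row. The sketch below assumes each respondent's SAH history is stored as a list of state labels; the three sequences shown are invented toy data and the function name is hypothetical.

```python
from collections import Counter

STATES = ["EX", "GOOD", "FAIR", "POOR", "VERY POOR"]


def transition_matrix(sequences):
    """Row-normalised transition frequencies: rows are the state at t - 1,
    columns are the state at t, pooled over individuals and waves."""
    counts = Counter()
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[(prev, curr)] += 1
    matrix = {}
    for prev in STATES:
        row_total = sum(counts[(prev, curr)] for curr in STATES)
        matrix[prev] = {
            curr: counts[(prev, curr)] / row_total if row_total else float("nan")
            for curr in STATES
        }
    return matrix


# Invented toy histories over three waves (not BHPS data).
toy_sequences = [
    ["EX", "EX", "GOOD"],
    ["GOOD", "GOOD", "FAIR"],
    ["EX", "GOOD", "GOOD"],
]
toy_matrix = transition_matrix(toy_sequences)
print(toy_matrix["EX"])
print(toy_matrix["GOOD"])
```

With eight waves each respondent contributes seven transitions, which is consistent with the totals in Table II: 7 × 2780 = 19,460 for men and 7 × 3344 = 23,408 for women.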

Socioeconomic Variables

Table III presents means for the regressors used in our empirical models for three different samples. The first sample uses all available observations for the variable in question. The second is the unbalanced estimation sample, which uses all available observations at each wave that provide complete information on the variables used in the empirical models and were observed at wave 1. The third is the balanced estimation sample which uses observations for which all relevant variables are measured in all eight waves. Comparison of the three samples shows the impact of attrition on the means of the observed characteristics. Overall, differences in means between the first sample and the unbalanced sample are very small. Comparison of the unbalanced and balanced samples shows that the differences are also relatively small but the balanced sample contains marginally more women, individuals who are married or cohabiting, individuals who have formal educational qualifications and higher incomes on average.

Table III. Variable means
              ALL AVAILABLE DATA    UNBALANCED ESTIMATION     BALANCED ESTIMATION
              ON VARIABLE           SAMPLE (NT = 64,053)      SAMPLE (NT = 48,992)
MALE          0.461                 0.462                     0.454
WIDOW         0.089                 0.089                     0.079
SINGLE        0.163                 0.161                     0.144
DIV/SEP       0.068                 0.069                     0.068
NON-WHITE     0.040                 0.034                     0.029
DEGREE        0.096                 0.108                     0.115
HND/A         0.202                 0.215                     0.226
O/CSE         0.272                 0.280                     0.287
HHSIZE        2.788                 2.791                     2.815
NCH04         0.144                 0.145                     0.149
NCH511        0.260                 0.260                     0.271
NCH1218       0.183                 0.183                     0.186
ln(INCOME)    9.498                 9.498                     9.530
AGE           47.0                  47.0                      46.8

Table IV presents sample means for men and women respectively. In order to obtain a parsimonious and informative description of the variables we select subsamples of the data based on particular sequences of outcomes.7 The first column of the table presents the sample means for the full samples of men and women. The second column contains information on those who were in excellent or good health in every period, while the third column describes the data for those who were in poor or very poor health for all eight waves. The remaining columns present data for those who made transitions in their health status over the sample period. The fourth column presents results for those who made a single move from excellent or good health, while the fifth column presents means for those who made a single move from poor or very poor health. Comparing across the columns of the table allows us to explore, without considering causality, whether certain variables are correlated with health outcomes.

Table IV. Variable means by subsample, balanced panel
(a) Men
              FULL          Excellent/good    Poor/very poor    Single move from    Single move from
                            8 yrs             8 yrs             excellent/good      poor/very poor
              N = 2780      N = 1163          N = 25            N = 252             N = 51
              NT = 22,240   NT = 9304         NT = 200          NT = 2016           NT = 408
SAH           3.947         4.471             1.615             3.625               3.299
WIDOW         0.032         0.021             0.06              0.047               0.056
SINGLE        0.175         0.167             0.03              0.147               0.213
DIV/SEP       0.051         0.043             0.07              0.067               0.098
NON-WHITE     0.030         0.026             0                 0.044               0.059
DEGREE        0.134         0.169             0.04              0.123               0.078
HND/A         0.271         0.304             0.2               0.258               0.275
O/CSE         0.259         0.280             0.12              0.218               0.275
HHSIZE        2.877         2.977             2.72              2.764               2.713
NCH04         0.144         0.165             0.04              0.109               0.105
NCH511        0.256         0.283             0.185             0.208               0.159
NCH1218       0.184         0.205             0.18              0.171               0.174
ln(INCOME)    9.589         9.710             9.222             9.528               9.470
AGE           46.1          44.4              53.3              49.7                48.4
(b) Women
              FULL          Excellent/good    Poor/very poor    Single move from    Single move from
                            8 yrs             8 yrs             excellent/good      poor/very poor
              N = 3344      N = 1123          N = 36            N = 306             N = 88
              NT = 26,752   NT = 8984         NT = 288          NT = 2448           NT = 704
SAH           3.806         4.432             1.531             3.603               3.260
WIDOW         0.119         0.096             0.229             0.150               0.091
SINGLE        0.119         0.114             0.055             0.127               0.189
DIV/SEP       0.082         0.077             0.198             0.094               0.114
NON-WHITE     0.028         0.020             0.056             0.039               0.045
DEGREE        0.098         0.126             0                 0.082               0.091
HND/A         0.189         0.233             0                 0.167               0.239
O/CSE         0.310         0.355             0.167             0.291               0.272
HHSIZE        2.76          2.83              2.597             2.604               2.915
NCH04         0.154         0.157             0.094             0.114               0.232
NCH511        0.284         0.307             0.226             0.224               0.335
NCH1218       0.188         0.200             0.097             0.192               0.192
ln(INCOME)    9.482         9.624             9.149             9.433               9.331
AGE           47.3          45.6              57.1              50.7                43.0

For men, the means of ln(INCOME) imply that average income for those in excellent or good health in all eight years is almost 40% greater than that for those in poor or very poor health in all eight years, and is around 13% greater than that of all observations combined. For women the figures are 60% and 15%. Those always in excellent or good health are younger than those always in poor or very poor health, while those who make a single move from excellent or good health are slightly older than those who make a single move from poor or very poor health. Those who always report poor or very poor health have a much lower level of academic attainment than those who always report excellent or good health. The latter, to a lesser degree, also have more qualifications than those who made a single move from excellent/good or poor/very poor health.

Non-response and Attrition Bias

Table V shows how the sample size and composition evolve across waves. The table, which gives figures for the whole sample and the subsamples of men and women, shows the number of observations that are available at each wave and the corresponding number of drop-outs between waves. These are expressed as wave-on-wave survival and attrition rates.8 Attrition rates are highest between waves 1 and 2, with the rate tending to decline over time. The table also disaggregates the attrition rates according to individuals' SAH at wave 1. This shows that attrition rates are inversely related to initial health and, in particular, attrition is highest among those who start the survey in very poor health. This pattern of health-related attrition persists throughout the panel and is stronger for men than for women.

Table V. Sample size, drop-outs and attrition rates by wave
(a) All data
        FULL SAMPLE                                              Attrition rate by SAH at t − 1
Wave    No. individuals   Survival rate   Drop-outs   Attrition rate   EX        GOOD      FAIR      POOR      VPOOR
1       10,256
2       8957              87.33%          1299        12.67%           11.54%    12.57%    13.01%    13.73%    23.74%
3       8162              79.58%          795         8.88%            8.08%     8.13%     9.65%     12.62%    19.46%
4       7825              76.30%          337         4.13%            6.67%     6.54%     6.73%     10.35%    14.74%
5       7430              72.45%          395         5.05%            6.21%     6.18%     7.87%     9.11%     16.34%
6       7238              70.57%          192         2.58%            3.11%     3.24%     5.06%     10.47%    18.83%
7       7102              69.25%          136         1.88%            3.15%     3.85%     4.79%     8.83%     8.75%
8       6839              66.68%          263         3.70%            3.43%     3.82%     5.30%     5.88%     17.01%
(b) Men
        FULL SAMPLE                                              Attrition rate by SAH at t − 1
Wave    No. individuals   Survival rate   Drop-outs   Attrition rate   EX        GOOD      FAIR      POOR      VPOOR
1       4832
2       4180              86.51%          652         13.49%           12.17%    13.45%    14.23%    14.63%    26.88%
3       3752              77.65%          428         10.24%           8.92%     9.51%     11.49%    14.58%    24.00%
4       3593              74.36%          159         4.24%            6.65%     7.40%     7.29%     8.52%     14.52%
5       3392              70.20%          201         5.59%            5.40%     7.42%     9.61%     9.72%     22.95%
6       3308              68.46%          84          2.48%            3.56%     3.05%     4.80%     12.16%    25.40%
7       3249              67.24%          59          1.78%            3.27%     4.46%     4.62%     9.65%     11.48%
8       3105              64.26%          144         4.43%            4.06%     4.43%     6.36%     7.00%     22.89%
(c) Women
        FULL SAMPLE                                              Attrition rate by SAH at t − 1
Wave    No. individuals   Survival rate   Drop-outs   Attrition rate   EX        GOOD      FAIR      POOR      VPOOR
1       5424
2       4777              88.07%          647         11.93%           10.83%    11.81%    12.06%    13.16%    21.43%
3       4410              81.31%          367         7.68%            7.19%     6.94%     8.21%     11.33%    16.36%
4       4232              78.02%          178         4.04%            6.69%     5.82%     6.32%     11.53%    14.89%
5       4038              74.45%          194         4.58%            7.12%     5.15%     6.67%     8.61%     11.96%
6       3930              72.46%          108         2.67%            2.63%     3.40%     5.24%     9.27%     14.29%
7       3853              71.04%          77          1.96%            3.02%     3.34%     4.91%     8.26%     7.07%
8       3734              68.84%          119         3.09%            2.74%     3.32%     4.53%     5.16%     12.61%
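The full-sample survival and attrition rates in Table V follow directly from the wave-by-wave counts. The sketch below assumes, as the figures in the table suggest, that the survival rate is measured against the wave 1 sample and the attrition rate against the previous wave; the function name is hypothetical.

```python
def survival_and_attrition(counts):
    """For each wave after the first, return the survival rate relative to
    wave 1, the number of drop-outs, and the attrition rate relative to the
    previous wave (both rates in per cent)."""
    first = counts[0]
    rows = []
    for prev, curr in zip(counts, counts[1:]):
        dropouts = prev - curr
        rows.append(
            {
                "survival": 100 * curr / first,
                "dropouts": dropouts,
                "attrition": 100 * dropouts / prev,
            }
        )
    return rows


# Number of individuals at waves 1 to 8, panel (a) of Table V.
all_data = [10256, 8957, 8162, 7825, 7430, 7238, 7102, 6839]

for wave, row in enumerate(survival_and_attrition(all_data), start=2):
    print(wave, f"{row['survival']:.2f}%", row["dropouts"], f"{row['attrition']:.2f}%")
```

The same function applies to the counts for men and women in panels (b) and (c).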

Table VI shows that the overall attrition rate across all eight waves of the panel varies with socioeconomic characteristics. The average rate of attrition over eight waves is 33%. As expected, attrition increases with individuals' age at the start of the panel, ranging from 31% for those aged under 30 to 59% for those aged over 70. Much of this age-related attrition is likely to be associated with health, through deaths, serious illness and moves to institutional care. Attrition is greater among those with lower income and with less formal education and is particularly high among those who had never married at the start of the panel (61%). The latter may be associated with higher occupational and geographic mobility among young single people. The table also shows that health-related attrition interacts with individuals' socioeconomic characteristics.9 So, for example, attrition rates are very high among elderly individuals (aged > 70) who start the survey in poor (78%) or very poor health (90%).

Table VI. Attrition rates (%) over eight waves by income, educational and marital status
                   FULL DATA   EX       GOOD     FAIR     POOR     VPOOR
ALL DATA           32.61       29.42    30.61    37.08    40.25    57.08
AGE GROUP
  <30              31.21       33.02    31.09    31.52    17.54    22.2
  31–50            26.10       26.09    24.70    28.90    24.72    43.75
  51–70            32.29       26.06    30.85    34.81    43.67    48.65
  >70              58.53       43.65    50.74    65.20    78.33    90.48
INCOME QUINTILE
  1                46.29       43.13    44.77    48.03    49.16    60.00
  2                33.33       29.73    29.00    38.83    41.30    57.35
  3                29.99       25.65    29.39    35.90    34.13    44.12
  4                28.33       28.45    27.40    26.55    35.11    66.67
  5                27.74       28.11    27.08    26.50    31.58    57.14
EDUCATION
  DEGREE           19.39       19.37    19.74    20.66    10.71    0.00
  HND/A            25.34       25.91    24.58    23.33    23.17    62.96
  O/CSE            28.80       29.13    26.92    33.57    26.96    40.00
  NOQUAL           39.49       34.38    37.21    42.97    46.51    56.72
MARITAL STATUS
  WIDOW            25.75       9.21     16.33    33.91    50.54    63.41
  SINGLE           60.69       64.34    60.19    58.26    54.95    37.50
  DIV/SEP          12.38       11.83    0.00     20.00    35.29    36.67
  MARRIED          26.53       20.60    25.20    33.08    35.22    63.71

Table VI provides only a description of simple bivariate relationships between attrition rates and socioeconomic characteristics. To extend this to a multivariate analysis, Tables VII and VIII present probit models for response/non-response at each wave of the panel, from wave 2 to wave 8, using the full sample of individuals who are observed at wave 1. The dependent variables for these probits equal 1 if the individual responds at the wave in question and 0 otherwise. The probability of response is modelled as a function of the wave 1 values of all of the regressors that are included in our empirical model of SAH. The tables show the partial effects of the regressors on the probability of response at each wave, along with an indication of which of these are statistically significant at the 5% level.10 These results reveal statistically significant associations between non-response and levels of educational attainment for both men and women. Those with DEGREE, HND/A and O/CSE qualifications are more likely to remain in the sample and the magnitude of this effect increases over the waves. On average, a man with a degree has a 0.084 higher probability of responding at wave 2, relative to one without academic qualifications. By wave 8 he has a 0.179 higher probability of responding. For women the corresponding figures are 0.084 and 0.180. Non-whites are less likely to remain in the sample, and this effect increases in magnitude as time progresses. By wave 8 the probability of responding among non-white men is 0.146 lower and among women it is 0.152 lower. There is no evidence of statistically significant income-related attrition among men while there is some evidence of an effect for women, at least in waves 3–5.
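Partial effects from a probit evaluated at the sample means can be illustrated as follows. This is a sketch of two common definitions (the derivative for a continuous regressor and the discrete change for a dummy variable); which definition the paper uses for each regressor is not stated in this section, and the coefficients and means below are hypothetical.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def partial_effect_continuous(beta, x_mean, k):
    """Derivative of P(respond) with respect to regressor k at the means."""
    index = sum(b * x for b, x in zip(beta, x_mean))
    return norm_pdf(index) * beta[k]


def partial_effect_dummy(beta, x_mean, k):
    """Change in P(respond) when dummy k switches from 0 to 1, with the other
    regressors held at their means."""
    others = sum(b * x for j, (b, x) in enumerate(zip(beta, x_mean)) if j != k)
    return norm_cdf(others + beta[k]) - norm_cdf(others)


# Hypothetical coefficients and means: constant, a qualification dummy, log income.
beta = [0.3, 0.25, 0.05]
x_mean = [1.0, 0.1, 9.5]
print(round(partial_effect_dummy(beta, x_mean, 1), 3))
print(round(partial_effect_continuous(beta, x_mean, 2), 3))
```

Both quantities depend on where the index is evaluated, so partial effects computed at the sample means can differ from averages of individual-level effects.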

Table VII. Probit models for response/non-response by wave (results are presented as partial effects on the probability of responding at wave t, evaluated at the sample means of the regressors): men
WAVE              2          3          4          5          6          7          8
ln(INCOME)        0.012      0.019      0.021      0.008      0.007      0.007      0.008
WIDOW             −0.013     −0.007     0.001      0.041      0.036      0.001      0.005
SINGLE            −0.013     −0.041*    −0.047*    −0.062*    −0.053*    −0.060*    −0.074*
DIV/SEP           −0.027     −0.022     −0.027     −0.005     0.008      −0.009     −0.025
NON-WHITE         −0.113*    −0.174*    −0.163*    −0.171*    −0.144*    −0.136*    −0.146*
DEGREE            0.084*     0.158*     0.182*     0.192*     0.171*     0.168*     0.179*
HND/A             0.076*     0.122*     0.118*     0.123*     0.123*     0.124*     0.137*
O/CSE             0.052*     0.074*     0.078*     0.089*     0.085*     0.092*     0.104*
HHSIZE            −0.023*    −0.024*    −0.030*    −0.025     −0.013     −0.008     −0.007
NCH04             0.069*     0.069*     0.072*     0.072*     0.047*     0.043*     0.041*
NCH511            0.036*     0.030      0.040*     0.020      0.012      0.009      −0.002
NCH1218           0.047*     0.050*     0.058*     0.047      0.022      0.021      0.014
AGE               −0.021     −0.027     −0.039     −0.045     −0.037     −0.032     −0.032
AGE2              0.073      0.090      0.126      0.145      0.127      0.113      0.104
AGE3              −0.090     −0.106     −0.149     −0.166     −0.154     −0.136     −0.110
AGE4              0.036      0.038      0.055      0.056      0.053      0.044      0.023
SAHEX(1)          0.004      0.014      −0.004     −0.007     −0.019     −0.025     −0.019
SAHFAIR(1)        0.003      −0.006     −0.029     −0.047*    −0.072*    −0.071*    −0.070*
SAHPOOR(1)        0.004      −0.009     −0.003     −0.023     −0.051     −0.051     −0.050
SAHVPOOR(1)       −0.109*    −0.105*    −0.113*    −0.133*    −0.186*    −0.204*    −0.198*
Log likelihood    −1833.6    −2527.5    −2658.3    −2838.7    −2885.8    −2917.3    −2992.7
N = 4828
* Denotes p ≤ 0.05.

Table VIII. Probit models for response/non-response by wave (results are presented as partial effects on the probability of responding at wave t, evaluated at the sample means of the regressors): women
WAVE              2          3          4          5          6          7          8
ln(INCOME)        0.009      0.026*     0.027*     0.025*     0.020      0.008      0.012
WIDOW             0.031*     0.034      0.034      0.014      0.002      0.006      0.003
SINGLE            −0.027     −0.044*    −0.064*    −0.075*    −0.074*    −0.073*    −0.070*
DIV/SEP           −0.026     −0.030     −0.016     −0.019     −0.042     −0.057*    −0.060*
NON-WHITE         −0.075*    −0.129*    −0.152*    −0.143*    −0.135*    −0.132*    −0.152*
DEGREE            0.084*     0.143*     0.140*     0.169*     0.176*     0.177*     0.180*
HND/A             0.053*     0.102*     0.098*     0.116*     0.112*     0.103*     0.117*
O/CSE             0.036*     0.048*     0.052*     0.066*     0.069*     0.070*     0.069*
HHSIZE            −0.012*    −0.014*    −0.018*    −0.016*    −0.016*    −0.0007    0.0003
NCH04             0.045*     0.047*     0.050*     0.046*     0.054*     0.035      0.031
NCH511            0.031*     0.031*     0.032*     0.019      0.021      0.004      0.001
NCH1218           0.031*     0.027*     0.030*     0.033*     0.039*     0.029*     0.020
AGE               0.022      0.049      0.033      0.038      0.025      0.028      0.019
AGE2              −0.068     −0.158     −0.109     −0.131     −0.083     −0.086     −0.068
AGE3              0.097      0.226*     0.164      0.199      0.134      0.134      0.125
AGE4              −0.051     −0.118*    −0.092     −0.112*    −0.083     −0.083     −0.087
SAHEX(1)          0.001      0.002      −0.002     −0.012     −0.009     −0.014     −0.004
SAHFAIR(1)        0.009      0.006      0.005      0.010      0.005      −0.005     0.001
SAHPOOR(1)        0.005      −0.022     −0.033     −0.022     −0.031     −0.043     −0.042
SAHVPOOR(1)       −0.064*    −0.067     −0.073     −0.078     −0.119*    −0.151*    −0.178*
Log likelihood    −1920.2    −2677.6    −2798.6    −3003.8    −3035.8    −3082.2    −3155.8
N = 5421
* Denotes p ≤ 0.05.

The pattern of health-related attrition is striking. For both men and women, very poor initial health (SAHVPOOR(1)) stands out as the main source of health-related attrition. For men the effect of very poor health is statistically significant at every wave (relative to the reference category of good initial health): the reduction in the probability of responding grows from 0.109 at wave 2 to 0.198 at wave 8. For women the reduction is 0.064 at wave 2, growing to 0.178 at wave 8, although it is statistically significant only at wave 2 and from wave 6 onwards. On the whole, the association between initial health and non-response seems to be limited to those in very poor health, although the impact of fair (SAHFAIR(1)) and poor (SAHPOOR(1)) initial health among men and of poor initial health among women does increase as the panel lengthens. In assessing the likely impact of this health-related attrition it is worth bearing in mind that only 1.5% of men and 1.9% of women report very poor health in wave 1.

3. MODELS AND ESTIMATION METHODS

To model self-assessed health we use dynamic panel ordered probit specifications on both balanced and unbalanced samples. We include previous health states in our empirical models in order to capture state dependence and the model can be interpreted as a first-order Markov process. Our models should be viewed as reduced form specifications as they do not include objects of choice, such as medical care, or other health inputs, such as lifestyle. The latent variable specification of the (reduced form) model that we estimate can be written as:

$$h^*_{it} = \beta' x_{it} + \gamma' h_{it-1} + \alpha_i + \varepsilon_{it} \qquad (1)$$

where $x_{it}$ is a set of observed variables which may be associated with the health indicator.11 To capture state dependence, $h_{it-1}$ is a vector of indicators for the individual's health state in the previous wave and $\gamma$ is the corresponding vector of parameters to be estimated. $\alpha_i$ is an individual-specific and time-invariant random component. $\varepsilon_{it}$ is a time and individual-specific error term which is assumed to be normally distributed, uncorrelated across individuals and waves, and uncorrelated with $\alpha_i$. The regressors are assumed to be strictly exogenous, that is, the $x_{it}$ are uncorrelated with $\varepsilon_{is}$ for all $t$ and $s$.12 As we do not have a natural scale for the latent variable, the variance of the idiosyncratic error term is restricted to equal one.13

In our data the latent outcome $h^*_{it}$ is not observed. Instead, we observe an indicator of the category in which the latent indicator falls ($h_{it}$). The observation mechanism can be expressed as:

$$h_{it} = j \quad \text{if} \quad \mu_{j-1} < h^*_{it} \le \mu_j, \qquad j = 1, \ldots, m \qquad (2)$$

where $\mu_0 = -\infty$, $\mu_j \le \mu_{j+1}$, $\mu_m = \infty$. Given the assumption that the error term is normally distributed, the probability of observing the particular category of SAH reported by individual $i$ at time $t$ ($h_{it}$), conditional on the regressors and the individual effect, is:

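$$P_{itj} = P(h_{it} = j) = \Phi(\mu_j - \beta' x_{it} - \gamma' h_{it-1} - \alpha_i) - \Phi(\mu_{j-1} - \beta' x_{it} - \gamma' h_{it-1} - \alpha_i) \qquad (3)$$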

where $\Phi(\cdot)$ is the standard normal distribution function. This formulation makes it clear that it is not possible to separately identify an intercept in the linear index ($\beta_0$) and the cut points ($\mu$): the model only identifies ($\mu_j - \beta_0$). To deal with this we have adopted a conventional normalization, setting $\beta_0 = 0$ (an alternative is to set $\mu_1 = 0$). By extension, it is clear that, without a priori restrictions, the individual effect ($\alpha_i$) cannot be distinguished from an individual-specific cut point shift. The same argument applies to the impact of the regressors on $h^*_{it}$ so long as the cut points are a linear function of the regressors.14 This should be borne in mind when interpreting the results presented below.
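As a minimal illustration of the category probabilities described above, the following Python sketch computes the probability of each SAH category as the difference between standard normal distribution functions evaluated at adjacent cut points. The function name, the numerical values and the ordering of the categories from very poor (1) to excellent (5) are assumptions made for illustration; this is not the estimation code used for the results in the paper.

```python
from statistics import NormalDist


def category_probabilities(index, cuts):
    """Return P(h = j), j = 1..m, given the linear index
    (beta'x + gamma'h_lag + alpha) and interior cut points mu_1..mu_{m-1}."""
    cdf = NormalDist().cdf
    # Cumulative probabilities at mu_0 = -inf, the interior cuts, and mu_m = +inf.
    cumulative = [0.0] + [cdf(mu - index) for mu in cuts] + [1.0]
    return [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]


# Hypothetical values, chosen only to illustrate the calculation.
example_cuts = [-1.0, 0.1, 1.3, 3.0]
probs = category_probabilities(index=0.5, cuts=example_cuts)

# Shifting the index and every cut point by the same constant leaves the
# probabilities unchanged, which is why an intercept is not separately identified.
shifted = category_probabilities(index=0.5 + 2.0, cuts=[c + 2.0 for c in example_cuts])
```

The second call illustrates the identification point: only the differences between the cut points and the intercept enter the probabilities.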

To implement the random effects estimator the individual effect can be integrated out, using the assumption that its density is $N(0, \sigma_\alpha^2)$, to give the sample log-likelihood function

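$$\ln L = \sum_{i=1}^{N} \ln \int_{-\infty}^{+\infty} \prod_{t=2}^{T_i} P_{itj}(\alpha) \, \frac{1}{\sqrt{2\pi\sigma_\alpha^2}} \exp\left(-\frac{\alpha^2}{2\sigma_\alpha^2}\right) d\alpha \qquad (4)$$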

This expression contains a univariate integral which can be approximated by Gauss–Hermite quadrature.
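The quadrature step can be sketched as follows. This is a minimal Python illustration, assuming NumPy is available; it is not the Stata routine used for the estimates, and the function names and example values are hypothetical. After the change of variable α = √2 σ_α x, the integral over a normal density with variance σ_α² is approximated by π^(−1/2) Σ_k w_k g(√2 σ_α x_k), where x_k and w_k are the Gauss–Hermite nodes and weights.

```python
import math
from statistics import NormalDist

from numpy.polynomial.hermite import hermgauss

_CDF = NormalDist().cdf


def _cumulative(bound, index):
    """Phi(bound - index), with the infinite end points handled explicitly."""
    if math.isinf(bound):
        return 1.0 if bound > 0 else 0.0
    return _CDF(bound - index)


def individual_likelihood(indices, outcomes, cuts, sigma_alpha, n_points=12):
    """Likelihood contribution of one individual with alpha integrated out.

    indices  : linear index (excluding alpha) for each observed wave
    outcomes : reported category (1..m) for each observed wave
    cuts     : interior cut points mu_1..mu_{m-1}
    """
    nodes, weights = hermgauss(n_points)
    bounds = [-math.inf] + list(cuts) + [math.inf]
    total = 0.0
    for x_k, w_k in zip(nodes, weights):
        alpha = math.sqrt(2.0) * sigma_alpha * float(x_k)  # change of variable
        product = 1.0
        for index, j in zip(indices, outcomes):
            upper = _cumulative(bounds[j], index + alpha)
            lower = _cumulative(bounds[j - 1], index + alpha)
            product *= upper - lower
        total += float(w_k) * product
    return total / math.sqrt(math.pi)


# Hypothetical example: three waves for one individual.
cuts = [-1.0, 0.1, 1.3, 3.0]
lik = individual_likelihood([0.4, 0.6, 0.5], [4, 4, 5], cuts, sigma_alpha=0.7)
```

Summing the logarithm of this quantity over individuals gives the sample log-likelihood. With a single wave the integral has a closed form, which provides a convenient check on the approximation.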

The ordered probit models are estimated using pooled ordered probit and random effects ordered probit estimators which are both available in STATA (Release 7.0, Stata Corporation).15 The cut points and the proportion of variance due to the individual effect are determined by the estimation routine.

3.1. Correlated Effects and Initial Conditions

To allow for the possibility that the observed regressors may be correlated with the individual effect we parameterize the individual effect (Mundlak, 1978; Chamberlain, 1984; Wooldridge, 2002a). This allows for correlation between the individual effects and the means of the regressors. In addition, because we are estimating dynamic models we need to take account of the problem of initial conditions. Heckman (1981) describes two assumptions that are typically invoked concerning a discrete time stochastic process with binary outcomes. The same issues arise with an ordered categorical variable. The first assumption is that the initial observations are exogenous variables. This is invalid when the error process is not serially independent and the first observation is not the true initial outcome of the process. In our case the first wave of the panel is clearly not the true start of the process, and serial independence of the errors is unlikely to hold. Treating the lagged dependent variables as exogenous when these assumptions are incorrect leads to inconsistent estimators. The second assumption often invoked is that the process is in equilibrium such that the marginal probabilities have approached their limiting values and can therefore be assumed time-invariant. This assumption is untenable when non-stationary variables such as age and time trends are included in the model, as they are here.

Wooldridge (2002a) has suggested an approach to deal with the initial conditions problem in non-linear dynamic random effects models by modelling the distribution of the unobserved effect conditional on the initial value and any exogenous explanatory variables. This conditional maximum likelihood (CML) approach results in a likelihood function based on the joint distribution of the observations conditional on the initial observations. Parameterizing the distribution of the unobserved effects leads to a likelihood function that is easily maximized using pre-programmed commands with standard software (e.g. STATA). However it should be noted that the CML approach does specify a complete model for the unobserved effects and may therefore be sensitive to misspecification.

We implement this approach by parameterizing the distribution of the individual effects as:

$$\alpha_i = \alpha_0 + \alpha'_1 h_{i1} + \alpha'_2 \bar{x}_i + u_i \qquad (5)$$

where $\bar{x}_i$ is the average over the sample period of the observations on the exogenous variables. $u_i$ is assumed to be distributed $N(0, \sigma_u^2)$ and independent of the $x$ variables, the initial conditions, and the idiosyncratic error term ($\varepsilon_{it}$). Substituting equation (5) into equation (1) gives a model that has a random effects structure, with the regressors at time $t$ augmented to include $h_{i1}$ (a vector of dummy variables) and $\bar{x}_i$. Three features should be noted. First, this specification implies that the identified coefficients of any time-invariant regressors are composite effects of the relevant elements of $\beta$ and $\alpha_2$.16 Secondly, all time dummies must be dropped from $\bar{x}_i$ to avoid perfect collinearity. Thirdly, the estimates of $\alpha_1$ are also of interest as they are informative about the relationship between the individual effect and initial health. We would expect there to be a positive gradient in the coefficient estimates as we move from very poor to excellent health.
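The augmentation of the regressors can be sketched in a few lines of Python. This is an illustrative sketch only: the record layout, the variable names and the restriction to a single time-varying regressor (log income) are assumptions, and the sketch presumes that each individual is observed in consecutive waves starting at wave 1.

```python
CATEGORIES = ["vpoor", "poor", "fair", "good", "excellent"]
REFERENCE = "good"  # baseline category for the health dummies


def health_dummies(state, prefix):
    """One indicator per non-reference health category."""
    return {f"{prefix}_{c}": int(state == c) for c in CATEGORIES if c != REFERENCE}


def augment(person_waves):
    """Build estimation rows (t >= 2) for one individual.

    person_waves: list of dicts with keys 'wave', 'sah', 'ln_income',
    assumed to cover consecutive waves starting at wave 1.
    """
    waves = sorted(person_waves, key=lambda r: r["wave"])
    # Within-individual mean of the time-varying regressor (x-bar_i).
    mean_ln_income = sum(r["ln_income"] for r in waves) / len(waves)
    # Initial health state dummies (h_i1), constant across the rows.
    initial = health_dummies(waves[0]["sah"], "sah1")
    rows = []
    for previous, current in zip(waves, waves[1:]):
        row = {
            "wave": current["wave"],
            "sah": current["sah"],
            "ln_income": current["ln_income"],
            "mean_ln_income": mean_ln_income,
        }
        row.update(health_dummies(previous["sah"], "lag"))  # h_{it-1}
        row.update(initial)
        rows.append(row)
    return rows


# Hypothetical individual observed in three waves.
example = [
    {"wave": 1, "sah": "fair", "ln_income": 9.0},
    {"wave": 2, "sah": "good", "ln_income": 9.2},
    {"wave": 3, "sah": "excellent", "ln_income": 9.4},
]
rows = augment(example)
```

Each row can then be passed to a standard pooled or random effects ordered probit routine, since the parameterized individual effect enters simply as additional regressors.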

3.2. Attrition Bias

Tables V–VIII have shown evidence of systematic patterns of attrition by socioeconomic characteristics and previous levels of health, but it remains to be seen whether this will lead to attrition bias in our empirical models of SAH. To test for attrition bias we use simple variable addition tests as proposed by Verbeek and Nijman (1992, p. 688). The test variables we use are (i) an indicator for whether the individual responds in the subsequent wave (NEXT WAVE), (ii) an indicator of whether the individual responds in all eight waves and, hence, is in the balanced sample (ALL WAVES) and (iii) a count of the number of waves that are observed for the individual (NUMBER OF WAVES). Each of these is added in turn to our dynamic correlated effects ordered probit model, given by equations (1) and (5) and estimated with the unbalanced sample. This gives three separate tests for attrition bias. These tests may have low power and do not correct the estimates for attrition bias (Verbeek, 2000). Additional evidence can be provided by Hausman-type tests that compare estimates from the balanced and unbalanced samples. However, in the context of ordered probit models, this is complicated by the fact that the coefficients from the two specifications do not have a common scale. Instead we rely on a comparison of average partial effects, which do share the same scale.
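The three test variables are simple functions of each individual's response pattern. A minimal Python sketch, with hypothetical function and key names, is given below; treating NEXT WAVE as undefined at the final wave is an assumption of the sketch rather than something stated in the text.

```python
def attrition_indicators(observed_waves, n_waves=8):
    """Variable addition test indicators for one individual.

    observed_waves: collection of wave numbers (1..n_waves) with a response.
    Returns a dict keyed by observed wave.
    """
    observed = set(observed_waves)
    in_all_waves = int(len(observed) == n_waves)
    indicators = {}
    for t in sorted(observed):
        indicators[t] = {
            # NEXT WAVE cannot be defined at the last wave; None marks that case.
            "next_wave": None if t == n_waves else int(t + 1 in observed),
            "all_waves": in_all_waves,
            "number_of_waves": len(observed),
        }
    return indicators


stayer = attrition_indicators(range(1, 9))  # responds in all eight waves
dropout = attrition_indicators([1, 2, 3])   # leaves the panel after wave 3
```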

To allow for attrition we adopt an inverse probability weighted (IPW) estimator and apply it to the pooled ordered probit model (Wooldridge, 2002b, c).17 To implement this estimator we estimate (probit) equations for response ($d_{it} = 1$) versus non-response ($d_{it} = 0$) at each wave, $t = 2, \ldots, 8$, conditional on a set of characteristics ($z_{i1}$) that are measured for all individuals at the first wave. This relies on ‘selection on observables’ and implies that attrition can be treated as ignorable non-response, conditional on $z_{i1}$ (Fitzgerald et al., 1998; Wooldridge, 2002c, p. 588).18 In practice $z_{i1}$ includes the initial values of all of the regressors, including initial health states. It also includes initial values of other indicators of morbidity (whether the individual reports a limiting health problem, whether they report a disability, and their GHQ-12 score, which indicates their psychological well-being as measured by the general health questionnaire) along with initial values of their activity status (self-employed, unemployed, retired, maternity leave, caring for the family, student and long-term sick, with employed as the reference category). The probits for response/non-response are estimated at each wave of the panel, from wave 2 to wave 8, using the full sample of individuals who are observed at wave 1. The inverses of the fitted probabilities from these models, $1/\hat{p}_{it}$, are then used to weight observations in the ML estimation of the pooled ordered probit model using:19

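$$\ln L = \sum_{i=1}^{N} \sum_{t=2}^{T} \frac{d_{it}}{\hat{p}_{it}} \ln L_{it} \qquad (6)$$

where $L_{it}$ denotes the pooled ordered probit likelihood contribution of individual $i$ at wave $t$.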

Wooldridge (2002b) shows that, under the ignorability assumption:

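$$P(d_{it} = 1 \mid h_{it}, h_{it-1}, x_{it}, z_{i1}) = P(d_{it} = 1 \mid z_{i1}), \qquad t = 2, \ldots, T \qquad (7)$$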

the IPW estimator is $\sqrt{N}$-consistent and asymptotically normal. Wooldridge also shows that using the estimated $\hat{p}_{it}$ rather than the true $p_{it}$ and ignoring the implied adjustment to the estimated standard errors leads to ‘conservative inference’, so that the standard errors are larger than they would be with an adjustment for the use of fitted rather than true probabilities. Therefore we do not adjust the standard errors.20

The IPW estimator can be adapted to allow the elements of $z$ to be updated and change across time, for example adding $z$ variables measured at $t - 1$ to predict response at $t$. This should improve the power of the probit models to predict non-response and hence make the ignorability assumption more plausible. In this case the probit model for attrition at wave $t$ is estimated relative to the sample that is observed at wave $t - 1$. This relies on attrition being an absorbing state and is therefore confined to ‘monotone attrition’ where respondents never re-enter the panel. Also, because estimation at each wave is based on the selected sample observed at the previous wave, the construction of inverse probability weights has to be adapted. The predicted probability weights are constructed cumulatively using $\hat{p}_{it} = \hat{\pi}_{i2} \times \hat{\pi}_{i3} \times \cdots \times \hat{\pi}_{it}$, where the $\hat{\pi}_{is}$ denote the fitted selection probabilities from each wave. In this version of the estimator the ignorability condition has to be extended to include future values of $h$ and $x$ (see Wooldridge, 2002c, p. 589). Once again Wooldridge shows that omitting a correction to the asymptotic variance estimator leads to conservative inference so we do not adjust the standard errors.
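The construction of the weights under the two variants of the estimator can be sketched as follows. This is a minimal Python illustration with hypothetical function names and made-up fitted probabilities; it assumes the response probits have already been estimated and shows only how the fitted probabilities are turned into weights and applied to the log-likelihood contributions.

```python
import math


def ipw_weights_wave1(p_hat):
    """First variant: p_hat[t] is the fitted probability of responding at
    wave t from a probit on wave 1 characteristics; the weight is its inverse."""
    return {t: 1.0 / p for t, p in p_hat.items()}


def ipw_weights_cumulative(pi_hat):
    """Second variant: pi_hat[t] is the fitted probability of responding at
    wave t given response at wave t - 1; the weight is the inverse of the
    cumulative product of these probabilities up to wave t."""
    weights, cumulative = {}, 1.0
    for t in sorted(pi_hat):
        cumulative *= pi_hat[t]
        weights[t] = 1.0 / cumulative
    return weights


def weighted_loglik(contributions, weights):
    """Weighted log-likelihood for one individual. Only waves with a response
    (d_it = 1) appear in `contributions`, so other waves contribute nothing."""
    return sum(weights[t] * math.log(lik) for t, lik in contributions.items())


# Hypothetical fitted probabilities for one individual observed at waves 2-4.
w1 = ipw_weights_wave1({2: 0.95, 3: 0.90, 4: 0.85})
w2 = ipw_weights_cumulative({2: 0.95, 3: 0.96, 4: 0.97})
ll = weighted_loglik({2: 0.40, 3: 0.35, 4: 0.50}, w2)
```

Under the cumulative construction the weights can only grow over time, which reflects the fact that an individual must survive every earlier wave to be observed at wave t.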

4. ESTIMATION RESULTS

The results for the various model specifications outlined above are reported in this section. Models for men and women are presented separately throughout.

4.1. Tests for Attrition

Table IX presents the variable addition tests for attrition bias estimated using the dynamic correlated effects ordered probit model. All three test variables show evidence of attrition for men and also suggest attrition among women, although the p-values on the statistics are higher. The positive coefficients on the test variables are consistent with the fact that response rates are positively associated with health. Adding these test variables to the model is not intended to ‘correct’ the estimates for attrition, but it is informative to compare the estimates with the baseline model that does not include the test variables. It is striking that, for key variables such as income and lagged health state, the differences between the estimated coefficients are negligible. This suggests that attrition may not bias the estimates of these effects, a result that is reinforced below.

Table IX. Verbeek and Nijman tests for attrition: based on dynamic ordered probit models with Wooldridge specification of correlated effects and initial conditions

                  MEN                                         WOMEN
                  β          Std. err.  t-Test     p-Value    β          Std. err.  t-Test     p-Value
NEXT WAVE         0.199      0.035      5.67       0.000      0.060      0.034      1.77       0.077
ALL WAVES         0.139      0.031      4.46       0.000      0.071      0.029      2.45       0.014
NUMBER OF WAVES   0.031      0.009      3.54       0.000      0.016      0.008      1.88       0.060

4.2. Estimates of Dynamic Ordered Probits

Tables X and XI present the coefficient estimates for the ordered probit models based on pooled and random effects specifications. In all cases we estimated a correlated effects version of the models by parameterizing the distribution of the unobserved individual effects as in equation (5). Both the pooled and random effects specifications were estimated on the balanced and unbalanced samples. The estimates for the pooled ordered probit models allow for serial correlation in the errors by using a robust estimator of the covariance matrix. In addition we estimated the pooled model using IPW to adjust for attrition. Both variants of the IPW estimator are presented: IPW-1 uses wave 1 regressors to predict non-response, IPW-2 also includes values from the previous wave as well as the initial wave and the sample is restricted by excluding observations that exhibit non-monotone attrition. For the random effects specifications we incorporated unobserved heterogeneity explicitly by including Gaussian random effects. These models were estimated by maximum likelihood using Gauss–Hermite quadrature with 12 evaluation points using reoprob.ado.21

Table X. Dynamic ordered probit models with Wooldridge specification of correlated effects and initial conditions (coefficients for year dummies and within means of demographics not reported): men

Columns:
(1) Pooled model, balanced sample, NT = 19,460
(2) Pooled model, unbalanced sample, NT = 24,371
(3) Pooled model, inverse probability weights IPW-1, NT = 24,370
(4) Pooled model, inverse probability weights IPW-2, NT = 23,211
(5) Random effects, balanced sample, NT = 19,460
(6) Random effects, unbalanced sample, NT = 24,371

                   (1)              (2)              (3)              (4)              (5)              (6)
ln(INCOME)         0.036 (0.022)    0.035 (0.019)    0.035 (0.020)    0.043 (0.021)    0.059 (0.025)    0.054 (0.021)
mean ln(INCOME)    0.190 (0.032)    0.171 (0.028)    0.168 (0.028)    0.177 (0.030)    0.294 (0.045)    0.257 (0.037)
WIDOW              0.005 (0.111)    0.011 (0.106)    0.018 (0.110)    0.002 (0.111)    0.017 (0.127)    0.022 (0.116)
SINGLE            −0.064 (0.056)   −0.029 (0.052)   −0.034 (0.052)   −0.066 (0.054)   −0.074 (0.065)   −0.038 (0.059)
DIV/SEP            0.110 (0.081)    0.061 (0.069)    0.050 (0.072)    0.047 (0.077)    0.103 (0.082)    0.060 (0.074)
NON-WHITE         −0.088 (0.056)   −0.124 (0.046)   −0.125 (0.047)   −0.138 (0.050)   −0.144 (0.094)   −0.224 (0.074)
DEGREE             0.040 (0.037)    0.068 (0.032)    0.069 (0.032)    0.069 (0.033)    0.068 (0.060)    0.121 (0.051)
HND/A              0.074 (0.029)    0.083 (0.025)    0.083 (0.025)    0.085 (0.026)    0.123 (0.047)    0.134 (0.039)
O/CSE              0.063 (0.028)    0.079 (0.024)    0.079 (0.025)    0.077 (0.025)    0.105 (0.046)    0.127 (0.038)
HHSIZE             0.016 (0.016)    0.013 (0.015)    0.015 (0.015)    0.020 (0.016)    0.019 (0.019)    0.018 (0.017)
NCH104            −0.034 (0.310)    0.003 (0.028)   −0.003 (0.028)   −0.024 (0.029)   −0.026 (0.037)    0.008 (0.033)
NCH511             0.006 (0.028)    0.026 (0.025)    0.017 (0.026)    0.009 (0.026)    0.013 (0.032)    0.033 (0.029)
NCH1218           −0.018 (0.029)   −0.009 (0.027)   −0.027 (0.027)   −0.031 (0.029)   −0.025 (0.034)   −0.018 (0.031)
AGE                0.011 (0.046)    0.021 (0.038)    0.027 (0.041)    0.013 (0.042)    0.084 (0.064)    0.089 (0.051)
AGE2              −0.080 (0.147)   −0.110 (0.119)   −0.134 (0.130)   −0.097 (0.132)   −0.320 (0.204)   −0.337 (0.164)
AGE3               0.146 (0.194)    0.184 (0.157)    0.221 (0.173)    0.179 (0.175)    0.475 (0.274)    0.493 (0.220)
AGE4              −0.082 (0.092)   −0.102 (0.074)   −0.122 (0.082)   −0.104 (0.083)   −0.242 (0.131)   −0.254 (0.104)
SAHEX(t − 1)       0.782 (0.029)    0.784 (0.025)    0.784 (0.026)    0.781 (0.027)    0.348 (0.027)    0.370 (0.025)
SAHFAIR(t − 1)    −0.749 (0.028)   −0.728 (0.025)   −0.725 (0.025)   −0.721 (0.027)   −0.376 (0.029)   −0.373 (0.026)
SAHPOOR(t − 1)    −1.51 (0.054)    −1.45 (0.045)    −1.43 (0.048)    −1.43 (0.051)    −0.824 (0.050)   −0.812 (0.043)
SAHVPOOR(t − 1)   −2.06 (0.109)    −2.02 (0.085)    −2.05 (0.086)    −2.04 (0.086)    −1.12 (0.090)    −1.12 (0.076)
SAHEX(1)           0.374 (0.027)    0.352 (0.024)    0.349 (0.024)    0.351 (0.025)    0.723 (0.039)    0.691 (0.034)
SAHFAIR(1)        −0.249 (0.029)   −0.275 (0.025)   −0.284 (0.026)   −0.278 (0.026)   −0.540 (0.049)   −0.595 (0.041)
SAHPOOR(1)        −0.529 (0.058)   −0.529 (0.047)   −0.547 (0.047)   −0.550 (0.050)   −1.18 (0.083)    −1.19 (0.068)
SAHVPOOR(1)       −0.660 (0.106)   −0.726 (0.088)   −0.722 (0.085)   −0.685 (0.082)   −1.48 (0.146)    −1.68 (0.119)
Cut1              −0.972 (0.579)   −0.977 (0.467)   −0.966 (0.496)   −0.998 (0.507)    0.547 (0.808)    0.241 (0.646)
Cut2               0.119 (0.578)    0.045 (0.466)    0.064 (0.495)    0.040 (0.506)    1.81 (0.808)     1.42 (0.646)
Cut3               1.29 (0.578)     1.19 (0.466)     1.21 (0.494)     1.19 (0.505)     3.17 (0.808)     2.75 (0.647)
Cut4               3.00 (0.578)     2.88 (0.466)     2.88 (0.494)     2.86 (0.506)     5.15 (0.809)     4.71 (0.648)
ICC                                                                                    0.332 (0.012)    0.330 (0.011)
Log likelihood    −19,194.0        −24,496.1        −24,784.2        −23,644.3        −18,678.4        −23,918.6

Notes:
1. Standard errors are reported in parentheses.
2. Cut1–4 are the estimated cut points.
3. ICC is the intra-class correlation coefficient, $\sigma_\alpha^2/(1 + \sigma_\alpha^2)$.
Table XI. Dynamic ordered probit models with Wooldridge specification of correlated effects and initial conditions (coefficients for year dummies and within means of demographics not reported): women

Columns:
(1) Pooled model, balanced sample, NT = 23,408
(2) Pooled model, unbalanced sample, NT = 28,619
(3) Pooled model, inverse probability weights IPW-1, NT = 28,618
(4) Pooled model, inverse probability weights IPW-2, NT = 27,232
(5) Random effects, balanced sample, NT = 23,408
(6) Random effects, unbalanced sample, NT = 28,619

                   (1)              (2)              (3)              (4)              (5)              (6)
ln(INCOME)         0.029 (0.021)    0.033 (0.018)    0.021 (0.019)    0.018 (0.020)    0.030 (0.023)    0.040 (0.020)
mean ln(INCOME)    0.127 (0.031)    0.113 (0.026)    0.132 (0.027)    0.134 (0.029)    0.197 (0.040)    0.173 (0.034)
WIDOW             −0.004 (0.068)   −0.030 (0.062)   −0.029 (0.061)   −0.007 (0.065)    0.030 (0.078)   −0.002 (0.071)
SINGLE            −0.011 (0.062)   −0.018 (0.055)   −0.028 (0.056)   −0.044 (0.060)   −0.009 (0.066)   −0.020 (0.060)
DIV/SEP           −0.021 (0.061)   −0.031 (0.055)   −0.041 (0.056)   −0.038 (0.059)   −0.046 (0.062)   −0.038 (0.056)
NON-WHITE         −0.219 (0.057)   −0.207 (0.046)   −0.200 (0.047)   −0.199 (0.052)   −0.336 (0.085)   −0.313 (0.068)
DEGREE             0.159 (0.038)    0.150 (0.034)    0.157 (0.034)    0.170 (0.035)    0.237 (0.057)    0.216 (0.052)
HND/A              0.085 (0.030)    0.097 (0.027)    0.102 (0.027)    0.098 (0.028)    0.134 (0.045)    0.151 (0.040)
O/CSE              0.113 (0.026)    0.103 (0.023)    0.109 (0.023)    0.112 (0.024)    0.174 (0.039)    0.149 (0.034)
HHSIZE            −0.004 (0.016)   −0.005 (0.015)   −0.021 (0.017)   −0.017 (0.019)    0.0001 (0.018)  −0.004 (0.016)
NCH104            −0.015 (0.031)   −0.001 (0.029)    0.013 (0.029)    0.011 (0.031)   −0.026 (0.033)   −0.009 (0.030)
NCH511             0.074 (0.027)    0.080 (0.025)    0.104 (0.025)    0.099 (0.027)    0.092 (0.029)    0.099 (0.027)
NCH1218            0.065 (0.027)    0.066 (0.025)    0.082 (0.025)    0.077 (0.027)    0.083 (0.031)    0.086 (0.029)
AGE               −0.026 (0.044)   −0.031 (0.036)   −0.083 (0.037)   −0.091 (0.040)    0.026 (0.058)    0.001 (0.048)
AGE2               0.056 (0.136)    0.073 (0.112)    0.247 (0.115)    0.276 (0.124)   −0.095 (0.180)   −0.014 (0.149)
AGE3              −0.034 (0.177)   −0.064 (0.145)   −0.306 (0.149)   −0.349 (0.162)    0.159 (0.235)    0.040 (0.192)
AGE4              −0.005 (0.082)    0.013 (0.067)    0.130 (0.069)    0.153 (0.075)   −0.099 (0.109)   −0.036 (0.089)
SAHEX(t − 1)       0.812 (0.028)    0.773 (0.025)    0.767 (0.028)    0.776 (0.032)    0.388 (0.027)    0.357 (0.024)
SAHFAIR(t − 1)    −0.679 (0.023)   −0.660 (0.021)   −0.648 (0.025)   −0.644 (0.028)   −0.349 (0.024)   −0.336 (0.022)
SAHPOOR(t − 1)    −1.34 (0.041)    −1.32 (0.036)    −1.32 (0.037)    −1.31 (0.039)    −0.776 (0.039)   −0.766 (0.035)
SAHVPOOR(t − 1)   −1.88 (0.081)    −1.87 (0.068)    −1.86 (0.068)    −1.83 (0.072)    −1.05 (0.069)    −1.07 (0.059)
SAHEX(1)           0.312 (0.026)    0.328 (0.023)    0.334 (0.023)    0.325 (0.024)    0.610 (0.036)    0.647 (0.032)
SAHFAIR(1)        −0.336 (0.024)   −0.312 (0.022)   −0.309 (0.023)   −0.318 (0.024)   −0.636 (0.040)   −0.601 (0.035)
SAHPOOR(1)        −0.525 (0.041)   −0.512 (0.036)   −0.524 (0.036)   −0.538 (0.037)   −1.05 (0.061)    −1.06 (0.053)
SAHVPOOR(1)       −0.719 (0.086)   −0.672 (0.068)   −0.709 (0.069)   −0.749 (0.071)   −1.46 (0.119)    −1.43 (0.094)
Cut1              −1.80 (0.519)    −1.91 (0.436)    −2.36 (0.429)    −2.46 (0.450)    −0.874 (0.722)   −1.30 (0.599)
Cut2              −0.803 (0.518)   −0.917 (0.435)   −1.37 (0.428)    −1.48 (0.449)     0.266 (0.721)   −0.159 (0.599)
Cut3               0.353 (0.520)    0.215 (0.435)   −0.225 (0.428)    0.334 (0.450)    1.58 (0.722)     1.14 (0.599)
Cut4               2.10 (0.520)     1.93 (0.435)     1.47 (0.430)     1.36 (0.454)     3.59 (0.722)     3.11 (0.600)
ICC                                                                                    0.308 (0.010)    0.314 (0.010)
Log likelihood    −24,016.7        −29,893.5        −30,170.1        −28,688.9        −23,441.0        −29,223.6

Notes:
1. Standard errors are reported in parentheses.
2. Cut1–4 are the estimated cut points.
3. ICC is the intra-class correlation coefficient, $\sigma_\alpha^2/(1 + \sigma_\alpha^2)$.

The estimated coefficients for the random effects model are not directly comparable to those reported for the pooled models due to different scaling of the error variance. The pooled ordered probit assumes that the error term as a whole is distributed $N(0, 1)$ for identification of $\beta$. The random effects ordered probit restricts $\varepsilon_{it}$ to be $N(0, 1)$, so that the overall error variance equals $1 + \sigma_\alpha^2$. This implies different scaling of the estimated coefficients in the two models. However, we can compare the relative effects of pairs of variables across the two models. Also, the magnitudes of the effects of individual variables can be compared by computing their average partial effects. These average partial effects are defined and tabulated below for the key variables of interest.

To formally test for state dependence we estimated dynamic models which included dummy variables representing one-period lags of the categories of the dependent variable (SAHEX(t − 1) to SAHVPOOR(t − 1)). Including state dependence is important. For the pooled models in columns (1)–(4), the estimated coefficients on the lagged categories of the dependent variable are large and highly statistically significant. There is a gradient across the estimated effects of previous health status as one moves from previous health status of very poor to excellent (the baseline category is lagged good health). Most of the coefficients, on the lagged variables and socioeconomic characteristics, are stable across the balanced and unbalanced samples and when the inverse probability weights are used to adjust for attrition.

Columns (5) and (6) introduce explicit unobserved individual heterogeneity into the dynamic model by specifying random effects. Again, these models were estimated by maximum likelihood. Allowing for heterogeneity substantially improves the fit of the model as evidenced by the change in log-likelihood. For men approximately 33% and for women approximately 31% of the latent error variance is attributable to unobserved heterogeneity, as measured by the intra-class correlation coefficient (ICC).

All of the models presented in Tables X and XI parameterize the unobserved individual effect as a function of the within-individual averages of the time-varying regressors and a vector of dummy variables to represent the first-period observations on the dependent variable.22 The estimated coefficients for the initial period observations are reported as SAHEX(1) to SAHVPOOR(1). There is a positive gradient in the estimated effects as we move from very poor to excellent initial period health. This implies that there exists a positive correlation between the initial period observations and unobserved latent health. It is striking that, comparing the random effects specifications to the pooled models, the coefficients on lagged health states are smaller and the coefficients on the initial health states are larger. In fact the magnitudes of the effects of lagged health relative to initial health are reversed in the random effects models compared to the pooled models.

Educational attainment is statistically significant for men, with the exception of DEGREE in the balanced samples, and it is highly significant for women. Conditioning on the within-individual average of income (mean ln(INCOME)) renders the current income variable (ln(INCOME)) non-significant except for the random effects specifications for men and for the unbalanced sample of women. Caution should be used in interpreting the results for mean income as, in general, it is not possible to separate a causal effect of long-term economic status on health from the correlation between mean income and the unobservable individual effect. However, one way of interpreting the results is to regard current income as a measure of transitory income shocks and mean income as a measure of long-term or ‘permanent’ income (see e.g. Contoyannis et al., 2004; Frijters et al., 2003 who adopt this interpretation).23

4.3. Average Partial Effects

The scaling of the ordered probit coefficients is arbitrary. To provide an indication of the magnitude of the associations between SAH and the regressors we present average partial effects (APEs). For continuous regressors, such as income, these are obtained by taking the derivative of the ordered probit probabilities with respect to the variable in question. For discrete regressors, such as lagged health state, they are obtained by taking differences. In both cases the partial effects are functions of $h_{it-1}$, $x_{it}$, and also the individual effect $\alpha_i$. One option is to compute these at $E(\alpha_i) = \alpha_0 + \alpha'_1 h_{i1} + \alpha'_2 \bar{x}_i$. The alternative, that we adopt, is to compute average partial effects (see e.g. Wooldridge, 2002a, p. 22). In this case the partial effects are averaged over the population distribution of heterogeneity and computed using the population averaged parameters $\beta_\alpha$. In the random effects specifications these are given by $\beta_\alpha = \beta/\sqrt{1 + \sigma_\alpha^2}$.24 Wooldridge (2002a) shows that computing the partial effect at the observed values of the regressors for each observation and averaging the estimates over the observations provides a consistent estimate of the APE.25
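To make the calculation concrete, the following Python sketch computes the average partial effect of a lagged-health dummy on the probability of reporting excellent health, rescaling the index and the top cut point by the square root of one plus the variance of the individual effect. It is a sketch under stated assumptions: the function names are hypothetical, the index values and coefficient are made up for illustration, and the base index is assumed to already include the conditional mean of the individual effect.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

_CDF = NormalDist().cdf


def variance_from_icc(icc):
    """Recover sigma_alpha^2 from ICC = sigma_alpha^2 / (1 + sigma_alpha^2)."""
    return icc / (1.0 - icc)


def prob_excellent(index, top_cut, sigma_alpha2):
    """P(h = excellent) averaged over the heterogeneity distribution, using
    parameters rescaled by sqrt(1 + sigma_alpha^2)."""
    scale = sqrt(1.0 + sigma_alpha2)
    return 1.0 - _CDF((top_cut - index) / scale)


def ape_dummy(base_indices, coefficient, top_cut, sigma_alpha2):
    """Mean and sample standard deviation of the per-observation change in
    P(h = excellent) when a dummy variable switches from 0 to 1.

    base_indices: linear index for each observation with the dummy set to 0.
    """
    effects = [
        prob_excellent(b + coefficient, top_cut, sigma_alpha2)
        - prob_excellent(b, top_cut, sigma_alpha2)
        for b in base_indices
    ]
    return mean(effects), stdev(effects)


# Hypothetical inputs, for illustration only.
base = [1.2, 2.0, 2.8, 3.5]
sigma2 = variance_from_icc(0.33)
ape, spread = ape_dummy(base, coefficient=0.35, top_cut=4.7, sigma_alpha2=sigma2)
```

Setting the variance of the individual effect to zero gives the corresponding calculation for a pooled model, where no rescaling is needed.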

In the ordered probit model it is possible to compute APEs for each of the five categories of self-assessed health. For parsimony, Table XII summarizes the APEs of income, educational attainment and lagged health state on the probability of reporting excellent health. In this case the sign of the APE has a clear qualitative interpretation, with a positive sign implying a positive association with health and vice versa. These are presented for all six versions of the model, along with the sample standard deviations of the partial effects. Comparing the balanced and unbalanced samples gives very similar results within the pooled and random effects models, suggesting that non-response does not lead to differences in the estimated APEs. This is reinforced by the fact that the APEs from the pooled models with and without inverse probability weights are virtually identical.

Table XII. Average partial effects on probability of reporting excellent health for selected variables

Columns:
(1) Pooled model, balanced sample
(2) Pooled model, unbalanced sample
(3) Pooled model, IPW-1
(4) Pooled model, IPW-2
(5) Random effects, balanced sample
(6) Random effects, unbalanced sample

(a) Men
                   (1)              (2)              (3)              (4)              (5)              (6)
ln(INCOME)         0.009 (0.004)    0.009 (0.004)    0.009 (0.004)    0.011 (0.005)    0.013 (0.006)    0.012 (0.005)
mean ln(INCOME)    0.049 (0.024)    0.043 (0.022)    0.042 (0.021)    0.045 (0.022)    0.066 (0.028)    0.056 (0.025)
DEGREE             0.010 (0.005)    0.017 (0.009)    0.018 (0.009)    0.018 (0.009)    0.015 (0.006)    0.027 (0.012)
HND/A              0.019 (0.009)    0.021 (0.011)    0.021 (0.010)    0.022 (0.011)    0.028 (0.011)    0.030 (0.013)
O/CSE              0.016 (0.008)    0.020 (0.010)    0.020 (0.010)    0.020 (0.010)    0.024 (0.010)    0.028 (0.012)
SAHEX(t − 1)       0.234 (0.087)    0.231 (0.090)    0.231 (0.090)    0.230 (0.089)    0.082 (0.031)    0.085 (0.035)
SAHFAIR(t − 1)    −0.170 (0.085)   −0.163 (0.084)   −0.162 (0.084)   −0.162 (0.083)   −0.080 (0.034)   −0.077 (0.036)
SAHPOOR(t − 1)    −0.242 (0.167)   −0.233 (0.163)   −0.232 (0.162)   −0.232 (0.162)   −0.151 (0.077)   −0.145 (0.078)
SAHVPOOR(t − 1)   −0.260 (0.198)   −0.253 (0.197)   −0.255 (0.199)   −0.255 (0.200)   −0.184 (0.104)   −0.179 (0.106)

(b) Women
                   (1)              (2)              (3)              (4)              (5)              (6)
ln(INCOME)         0.006 (0.004)    0.007 (0.004)    0.005 (0.003)    0.004 (0.002)    0.006 (0.003)    0.008 (0.004)
mean ln(INCOME)    0.028 (0.016)    0.025 (0.015)    0.029 (0.017)    0.030 (0.017)    0.039 (0.020)    0.033 (0.018)
DEGREE             0.037 (0.020)    0.034 (0.019)    0.036 (0.020)    0.039 (0.022)    0.049 (0.024)    0.044 (0.022)
HND/A              0.019 (0.011)    0.022 (0.013)    0.023 (0.013)    0.022 (0.013)    0.027 (0.014)    0.030 (0.015)
O/CSE              0.026 (0.015)    0.023 (0.013)    0.024 (0.014)    0.025 (0.015)    0.035 (0.018)    0.029 (0.015)
SAHEX(t − 1)       0.220 (0.095)    0.206 (0.092)    0.205 (0.091)    0.208 (0.092)    0.082 (0.038)    0.074 (0.035)
SAHFAIR(t − 1)    −0.132 (0.078)   −0.128 (0.076)   −0.127 (0.075)   −0.127 (0.074)   −0.064 (0.034)   −0.061 (0.033)
SAHPOOR(t − 1)    −0.185 (0.144)   −0.182 (0.142)   −0.183 (0.142)   −0.183 (0.142)   −0.121 (0.073)   −0.118 (0.072)
SAHVPOOR(t − 1)   −0.201 (0.175)   −0.198 (0.173)   −0.199 (0.173)   −0.199 (0.173)   −0.144 (0.095)   −0.144 (0.097)

The income effects are larger for mean income (mean ln(INCOME)) than for current income (ln(INCOME)) and are a little larger in the random effects models than the pooled models. The APEs for lagged health state are noticeably lower for the random effects specifications, particularly for excellent and fair previous health (good previous health is the reference category). This is driven by the fact that the magnitudes of the effects of lagged health relative to initial health are reversed in the random effects models compared to the pooled models. The APEs of educational attainment are larger in the random effects models and are larger for women than for men.

4.4. Subsample Analysis

As noted in the descriptive analysis, it appears that the dynamics of health may be influenced by age, income and educational status. To investigate this further we split the sample of males and females into subsamples based on age (≤45 and >45) at the first wave, income quartile at the first wave, and highest attained educational qualification. For each subsample we estimated a dynamic random effects ordered probit model controlling for the initial conditions and correlated effects. The results are presented in Tables XIII–XV. For both men and women the proportion of latent error variance attributable to unobserved heterogeneity is approximately the same for the age (0.32–0.34 for men and 0.31–0.33 for women), education (0.29–0.34 for men and 0.27–0.33 for women) and income (0.29–0.36 for men and 0.29–0.32 for women) subsamples compared to the full model (0.33 for men and 0.31 for women). This implies that conditional on age, education and income effects, the composition of the error variance for latent health is approximately the same across age, education and income groups.

Table XIII. Average partial effects on probability of reporting excellent health for dynamic random effects ordered probit by age group, unbalanced sample

                   MEN                                 WOMEN
                   Age ≤ 45          Age > 45          Age ≤ 45          Age > 45
ln(INCOME)         0.003 (0.001)     0.025 (0.015)     0.017 (0.007)    −0.006 (0.004)
mean ln(INCOME)    0.074 (0.024)     0.035 (0.022)     0.036 (0.015)     0.037 (0.026)
DEGREE             0.033 (0.010)     0.008 (0.005)     0.048 (0.018)     0.025 (0.017)
HND/A              0.031 (0.010)     0.023 (0.013)     0.029 (0.011)     0.030 (0.020)
O/CSE              0.027 (0.009)     0.027 (0.016)     0.033 (0.013)     0.020 (0.014)
SAHEX(t − 1)       0.092 (0.027)     0.077 (0.042)     0.077 (0.028)     0.070 (0.044)
SAHFAIR(t − 1)    −0.077 (0.026)    −0.069 (0.042)    −0.063 (0.026)    −0.053 (0.037)
SAHPOOR(t − 1)    −0.163 (0.068)    −0.116 (0.082)    −0.132 (0.063)    −0.092 (0.075)
SAHVPOOR(t − 1)   −0.201 (0.093)    −0.141 (0.109)    −0.176 (0.096)    −0.105 (0.091)
Table XIV. Average partial effects on probability of reporting excellent health for dynamic random effects ordered probit by educational attainment, unbalanced sample

(a) Men

| | DEGREE | HND/A | O/CSE | NO QUAL |
|---|---|---|---|---|
| ln(INCOME) | −0.007 (0.002) | 0.013 (0.005) | 0.009 (0.003) | 0.020 (0.012) |
| mean ln(INCOME) | 0.054 (0.016) | 0.078 (0.028) | 0.065 (0.023) | 0.041 (0.025) |
| SAHEX(t − 1) | 0.097 (0.025) | 0.103 (0.033) | 0.085 (0.027) | 0.069 (0.038) |
| SAHFAIR(t − 1) | −0.088 (0.028) | −0.097 (0.037) | −0.079 (0.028) | −0.061 (0.037) |
| SAHPOOR(t − 1) | −0.138 (0.053) | −0.179 (0.082) | −0.158 (0.068) | −0.111 (0.078) |
| SAHVPOOR(t − 1) | −0.020 (0.006) | −0.234 (0.125) | −0.162 (0.074) | −0.132 (0.101) |

(b) Women

| | DEGREE | HND/A | O/CSE | NO QUAL |
|---|---|---|---|---|
| ln(INCOME) | 0.016 (0.006) | 0.021 (0.008) | 0.001 (0.0004) | 0.004 (0.003) |
| mean ln(INCOME) | 0.017 (0.006) | 0.012 (0.005) | 0.070 (0.029) | 0.028 (0.019) |
| SAHEX(t − 1) | 0.110 (0.032) | 0.084 (0.028) | 0.080 (0.029) | 0.058 (0.036) |
| SAHFAIR(t − 1) | −0.098 (0.035) | −0.072 (0.028) | −0.072 (0.030) | −0.041 (0.028) |
| SAHPOOR(t − 1) | −0.131 (0.053) | −0.138 (0.063) | −0.135 (0.066) | −0.083 (0.065) |
| SAHVPOOR(t − 1) | −0.200 (0.094) | −0.180 (0.092) | −0.177 (0.098) | −0.097 (0.083) |

Table XV. Average partial effects on probability of reporting excellent health for dynamic random effects ordered probit by income quartile, unbalanced sample

(a) Men

| | 1ST | 2ND | 3RD | 4TH |
|---|---|---|---|---|
| ln(INCOME) | 0.012 (0.007) | 0.011 (0.005) | 0.027 (0.011) | −0.001 (0.0004) |
| mean ln(INCOME) | 0.036 (0.021) | 0.059 (0.029) | 0.120 (0.048) | 0.058 (0.021) |
| DEGREE | 0.048 (0.027) | 0.035 (0.016) | −0.010 (0.004) | 0.046 (0.016) |
| HND/A | 0.042 (0.023) | 0.025 (0.012) | 0.001 (0.0002) | 0.046 (0.016) |
| O/CSE | 0.029 (0.017) | 0.036 (0.017) | −0.001 (0.0004) | 0.045 (0.016) |
| SAHEX(t − 1) | 0.084 (0.044) | 0.063 (0.029) | 0.096 (0.034) | 0.092 (0.029) |
| SAHFAIR(t − 1) | −0.062 (0.036) | −0.057 (0.028) | −0.081 (0.033) | −0.106 (0.040) |
| SAHPOOR(t − 1) | −0.113 (0.076) | −0.109 (0.060) | −0.175 (0.090) | −0.186 (0.088) |
| SAHVPOOR(t − 1) | −0.138 (0.102) | −0.172 (0.107) | −0.184 (0.100) | −0.173 (0.084) |

(b) Women

| | 1ST | 2ND | 3RD | 4TH |
|---|---|---|---|---|
| ln(INCOME) | 0.009 (0.006) | 0.009 (0.005) | 0.012 (0.006) | 0.001 (0.0003) |
| mean ln(INCOME) | 0.015 (0.011) | 0.022 (0.013) | 0.039 (0.019) | 0.024 (0.008) |
| DEGREE | 0.091 (0.053) | −0.015 (0.009) | 0.067 (0.030) | 0.044 (0.014) |
| HND/A | 0.027 (0.018) | 0.020 (0.011) | 0.047 (0.022) | 0.023 (0.008) |
| O/CSE | 0.038 (0.025) | 0.012 (0.007) | 0.014 (0.007) | 0.059 (0.019) |
| SAHEX(t − 1) | 0.055 (0.034) | 0.080 (0.040) | 0.087 (0.037) | 0.079 (0.024) |
| SAHFAIR(t − 1) | −0.038 (0.026) | −0.065 (0.038) | −0.074 (0.037) | −0.065 (0.022) |
| SAHPOOR(t − 1) | −0.072 (0.055) | −0.121 (0.084) | −0.134 (0.078) | −0.137 (0.055) |
| SAHVPOOR(t − 1) | −0.097 (0.082) | −0.130 (0.097) | −0.172 (0.113) | −0.155 (0.066) |

Table XIII shows that the APEs for mean income are stable for women but larger for younger men than for older men. The APEs for education are larger for younger people than for older people. The magnitudes of the state dependence effects are a little smaller for older people. Table XIV does not exhibit strong patterns in the APEs by education, although the effect of state dependence tends to be lower among those with no formal qualifications. Table XV does not reveal any clear patterns in the APEs when the sample is split into income quartiles.

5. DISCUSSION

This paper considers the determinants of a categorical indicator of self-assessed health using eight waves (1991–1998) of the British Household Panel Survey (BHPS). Previous analyses of health using the BHPS (e.g. Benzeval et al., 2000) have used simple empirical models and measures of income which have not fully exploited the panel dimension of the data. Our models allow for persistence in the observed outcomes due to state dependence and unobserved individual heterogeneity. Allowing for persistence is important: comparison of the observed outcomes with a simple multinomial model shows that persistence is substantial in our dataset.
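The random baseline behind this comparison can be reproduced with a short calculation. The sketch below assumes independent draws with a constant probability of excellent health in each of the eight waves, and uses the proportions and sample sizes quoted in the Introduction (0.277 and 2780 for men, 0.216 and 3344 for women); the function names are illustrative.

```python
def expected_always_in_state(p: float, waves: int, n: int) -> float:
    """Expected number of people in the same state in every wave,
    if each wave were an independent draw with constant probability p."""
    return n * p ** waves


def sample_needed_for_one(p: float, waves: int) -> float:
    """Sample size at which one such person is expected on average."""
    return 1.0 / p ** waves


# Men: close to the 0.0964 quoted in the Introduction, against 101 observed.
men_expected = expected_always_in_state(0.277, 8, 2780)

# Women: close to the 0.0158 quoted in the Introduction, against 83 observed.
women_expected = expected_always_in_state(0.216, 8, 3344)
```

Under this baseline fewer than one person in either sample would be expected to report excellent health in all eight waves, which is what makes the observed counts of 101 men and 83 women such strong evidence of persistence.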

The analysis provides the following conclusions. Descriptive evidence shows that there is health-related attrition in the data, with those in very poor initial health more likely to drop out, and variable addition tests provide evidence of attrition bias in the dynamic panel data models. Nevertheless, a comparison of estimates based on the balanced samples, the unbalanced samples and the samples corrected for attrition using inverse probability weights does not show substantive differences in the average partial effects of the variables of interest. So, while health-related attrition exists, it does not appear to distort the magnitudes of the estimated effects of state dependence and socioeconomic status.26
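The logic of inverse probability weighting can be illustrated with a small simulation. This is a minimal sketch rather than the estimator used in the paper: response propensities are estimated by simple cell frequencies within wave-1 health groups instead of a probit model, and every probability in the simulated population is hypothetical.

```python
import random


def ipw_demo(n: int = 50_000, seed: int = 0):
    """Simulate attrition that depends on observed wave-1 health and compare
    the unweighted and inverse-probability-weighted respondent means."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        poor_start = rng.random() < 0.2  # observed at wave 1
        excellent_later = rng.random() < (0.1 if poor_start else 0.4)
        responds = rng.random() < (0.4 if poor_start else 0.8)
        people.append((poor_start, excellent_later, responds))

    # Benchmark: mean outcome if nobody had dropped out.
    full_mean = sum(y for _, y, _ in people) / n

    # Estimated response propensity within each wave-1 health cell.
    p_hat = {}
    for group in (False, True):
        cell = [r for z, _, r in people if z == group]
        p_hat[group] = sum(cell) / len(cell)

    respondents = [(z, y) for z, y, r in people if r]
    unweighted = sum(y for _, y in respondents) / len(respondents)

    # Weight each respondent by the inverse of the estimated propensity.
    weights = [1.0 / p_hat[z] for z, _ in respondents]
    ipw = sum(w * y for w, (_, y) in zip(weights, respondents)) / sum(weights)
    return full_mean, unweighted, ipw


full_mean, unweighted_mean, ipw_mean = ipw_demo()
```

In this simulated setting the unweighted respondent mean overstates the prevalence of excellent health, because those in poor initial health are more likely to drop out, while the weighted mean moves back towards the full-sample benchmark.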

Self-assessed health is characterized by substantial positive state dependence and unobserved permanent heterogeneity. Including state dependence dramatically reduces the impact of individual heterogeneity. Conditioning on the initial period health outcomes and within-individual averages of the exogenous variables reduces the impact of heterogeneity and state dependence. In our models unobservable heterogeneity accounts for around 30% of the unexplained variation in health.

Socioeconomic inequalities in health have been a focus of much research by economists and others, and self-assessed health has been used as the basis for broad international comparisons of socioeconomic inequalities in health (see e.g. van Doorslaer et al., 1997; van Doorslaer and Koolman, 2002). Our findings suggest that these methods could usefully be extended to incorporate analysis, based on panel data, that explicitly incorporates the contribution of heterogeneity and state dependence to the evolution of health inequalities over time.

The presence of state dependence means that short-term policy interventions designed to improve health may have longer-term implications. It has been emphasized above that caution should be used in interpreting the results for mean income as, in general, it is not possible to separate a causal effect of long-term economic status on health and the correlation between mean income and the unobservable individual effect. However, if current income is regarded as a measure of transitory income shocks and mean income as a measure of long-term or ‘permanent’ income (e.g. Contoyannis et al., 2004; Frijters et al., 2003), our results suggest that permanent income has a much greater impact on SAH than transitory income and also that the impact of permanent income is larger for men than women. Previous work has suggested that the relationship between permanent deprivation and health is substantially larger than between temporary deprivation and health (e.g. Benzeval et al., 2000). So policies aimed at long-term financial security may have more influence on health than protection against short-term fluctuations.

The association between socioeconomic status, economic deprivation and health, along with the persistence of health problems, has implications for the allocation of health care resources. For example, many countries rely on weighted capitation for geographic resource allocation, capitation for reimbursement of providers, or risk adjustment formulas for setting contributions to social insurance funds. Our results suggest that risk adjustments to capitation shares should not only use conventional standardizing factors, such as age and sex, but also exploit measures of socioeconomic status and deprivation, and that they may benefit from taking account of persistence in health problems.

Acknowledgements

Data from the British Household Panel Survey (BHPS) were supplied by the ESRC Data Archive. Neither the original collectors of the data nor the Archive bear any responsibility for the analysis or interpretations presented here. This paper derives from the project ‘The dynamics of income, health and inequality over the lifecycle’ (known as the ECuity III Project), which is funded in part by the European Community's Quality of Life and Management of Living Resources programme (contract QLK6-CT-2002-02297). Nigel Rice is in part supported by the UK Department of Health programme of research at the Centre for Health Economics. We are grateful to Maarten Lindeboom, John Rust and three anonymous referees for their helpful comments on an earlier version of this paper.

  • 1

    Analysis of the dynamics of income and poverty is well established (see Jenkins, 2000 for a review) but the empirical analysis of health dynamics has received little attention. The starting point for the analysis of health dynamics is Grossman's (1972, 2000) human capital model of the demand for health. The Grossman model has provided a framework for empirical studies of the demand for health and health care (e.g. van Doorslaer, 1987; Wagstaff, 1993; Salas, 2002).

  • 2

    The BHPS does include information on the reasons for leaving the survey and the reasons for refusing to be interviewed that, in principle, should allow us to distinguish between health-related attrition such as death or serious illness and other reasons for non-response. In practice this information is itself incomplete: less than 10% of the missing observations are due to known deaths and over 75% are listed as ‘don't knows’. So, in our empirical models we adopt a reduced form approach that does not distinguish between different types of attrition and we do not attempt to estimate a separate (structural) equation for mortality.

  • 3

    While eleven waves were available at the time of writing, the self-assessed health question and categories were reworded for wave 9 when the SF-36 questionnaire was included in the survey. The distribution of SAH at wave 9 is quite different from the other waves and we have confined the analysis to the first eight consecutive waves.

  • 4

    For further details see Taylor et al. (1998).

  • 5

    General evidence of non-random measurement error in self-reported health is reviewed in Currie and Madrian (1999). Crossley and Kennedy (2002) report evidence of measurement error in a 5-category SAH question. They exploit the fact that a random subsample of respondents to the 1995 Australian National Health Survey were asked the SAH question twice, before and after other morbidity questions. The first question was administered as part of the SF-36 questionnaire on a self-completion form, the second as part of a face-to-face interview on the main questionnaire. They find a statistically significant difference in the distribution of SAH between the two questions and evidence that these differences are related to age, income and occupation. This measurement error could be explained by a mode of administration effect, due to the use of self-completion and face-to-face interviews (Grootendorst et al., 1997 find evidence that self-completion questions reveal more morbidity); or a ‘framing’ or learning effect by which SAH responses are influenced by the intervening morbidity questions.

  • 6

    All available observations are used to construct each of the figures presented in the paper. Sample sizes vary, depending on the variables involved.

  • 7

    To describe the data for each sequence of outcomes would be impractical.

  • 8

    The survival rate is the ratio of observations available at wave t to the sample at wave 1. The attrition rate is the ratio of the number of drop-outs between waves t and t − 1 to the number of observations at t − 1.

  • 9

    Some caution is required in interpreting Table VI as some of the results are based on small cell sizes.

  • 10

    The partial effects are computed as marginal effects for continuous regressors and average effects for discrete regressors, evaluated at the sample means of the other regressors in the model.

  • 11

    The dynamic models are estimated using data from waves 2–8 due to the use of lagged dependent variables.

  • 12

    We do not attempt to estimate a specification that allows for heterogeneity, state dependence and serial correlation in $\varepsilon_{it}$ simultaneously, because of the problems of separately identifying state dependence and serial correlation (see e.g. Hyslop, 1999; Contoyannis et al., 2004).

  • 13

    When the pooled model is estimated the variance of the total error term is normalized to equal one.

  • 14

    When there is heteroskedasticity in the latent variable equation, with individual-specific variances $\sigma_i^2$, the ordered probit model can be reformulated as a homoskedastic latent variable specification but with cut points equal to $\mu_j \sigma_i$ (van Doorslaer and Jones, 2003).

  • 15

    The random effects specification is estimated using the program reoprob.ado, written by Guillaume R. Frechette (Stata Technical Bulletin 59, January 2001).

  • 16

    The contribution of $\alpha_2$ will depend on the strength of correlation between the time-invariant regressors and $\alpha_i$.

  • 17

    This estimator is general in the sense that it can be applied to any problem that can be formulated as maximizing or minimizing a sample average of objective functions, which encompasses partial and quasi-ML estimators. But it can only be applied to an objective function that is additive across observations and therefore cannot be applied to the log-likelihood function for the random effects specification.

  • 18

    Selection on observables requires that $z_{i1}$ contains variables that predict attrition and that are correlated with the outcome of interest (SAH) but which are deliberately excluded from the structural model (i.e. equation (1)). This contrasts with the selection on unobservables approach which seeks ‘instruments’ that are correlated with attrition but independent of the error term in (1) (see e.g. Fitzgerald et al., 1998).

  • 19

    This estimator is implemented using the pweights option in Stata.

  • 20

    Wooldridge (2002b) demonstrates this result for the case where the propensity scores are estimated using maximum likelihood binary response models, such as our probit specifications. This is counter to the usual result for two-step estimators, where adjustment for the use of fitted values usually leads to larger standard errors.

  • 21

    Experiments with more evaluation points showed a negligible difference in the log-likelihood and parameter estimates.

  • 22

    The likelihood ratio test comparing these models with the simple dynamic models with unparameterized individual unobserved heterogeneity substantially rejects exogeneity of the regressors and initial period observations.

  • 23

    Of course, the assumption that income is strictly exogenous may not hold. But it is worth noting that the models do condition on $h_{t-1}$ (see e.g. Adams et al., 2003).

  • 24

    In the pooled models the total error variance is normalized to 1 and the estimated βs are population-averaged parameters by default. The random effects estimates have to be re-scaled when computing the partial effects.

  • 25

    The estimates are averaged across the eight waves of the panel as well as across individuals to give a single point estimate. Wooldridge (2002a) demonstrates this result with respect to a binary y, using the law of iterated expectations to derive the result. This is easily extended to an ordered y.

  • 26

    Similar findings have been reported concerning the negligible influence of attrition bias in models of income dynamics and various labour market outcomes (see e.g. Lillard and Panis, 1998; Zabel, 1998; Ziliak and Kniesner, 1998).
